Evaluating multicore algorithms on the unified memory model
Abstract
One of the challenges to achieving good performance on multicore architectures is the effective utilization of the underlying memory hierarchy. While this is already an issue for single-core architectures, it is a critical problem for multicore chips. In this paper, we formulate the unified multicore model (UMM) to help understand the fundamental limits on cache performance on these architectures. The UMM seamlessly handles different types of multicore processors with varying degrees of cache sharing at different levels. We demonstrate that our model can be used to study a variety of multicore architectures on a variety of applications. In particular, we use it to analyze an option pricing problem using the trinomial model and develop an algorithm for it that has near-optimal memory traffic between cache levels. We have implemented the algorithm on two quad-core Intel Xeon 5310 1.6 GHz processors (8 cores in total). It achieves a peak performance of 19.5 GFLOPs, which is 38% of the theoretical peak of the multicore system. We demonstrate that our algorithm outperforms compiler-optimized and auto-parallelized code by a factor of up to 7.5.
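For readers unfamiliar with the trinomial model mentioned in the abstract, the following is a minimal sketch of trinomial-tree pricing by backward induction. It is not the cache-optimized algorithm developed in the paper. It assumes a European option and the common textbook parameterization in which one trinomial step equals two Cox-Ross-Rubinstein binomial half-steps. The function name `trinomial_price` and all parameter names are hypothetical and chosen only for illustration.

```python
import math


def trinomial_price(s0, strike, rate, sigma, maturity, steps, is_call=True):
    """Price a European option on a recombining trinomial tree (illustrative sketch)."""
    dt = maturity / steps

    # Assumed parameterization: up factor u = exp(sigma * sqrt(2 dt)), down = 1/u, middle = 1.
    up = math.exp(sigma * math.sqrt(2.0 * dt))
    growth = math.exp(rate * dt / 2.0)
    half_up = math.exp(sigma * math.sqrt(dt / 2.0))
    half_dn = 1.0 / half_up

    # Risk-neutral probabilities of the up, down, and middle branches.
    p_up = ((growth - half_dn) / (half_up - half_dn)) ** 2
    p_dn = ((half_up - growth) / (half_up - half_dn)) ** 2
    p_mid = 1.0 - p_up - p_dn
    disc = math.exp(-rate * dt)

    # Payoffs at maturity: node j holds the price s0 * up**(j - steps).
    sign = 1.0 if is_call else -1.0
    values = [
        max(sign * (s0 * up ** (j - steps) - strike), 0.0)
        for j in range(2 * steps + 1)
    ]

    # Backward induction, updating one array in place. Walking j upward means
    # values[j + 1] and values[j + 2] still hold the later time level when read.
    for n in range(steps - 1, -1, -1):
        for j in range(2 * n + 1):
            values[j] = disc * (
                p_dn * values[j] + p_mid * values[j + 1] + p_up * values[j + 2]
            )
    return values[0]


if __name__ == "__main__":
    print(trinomial_price(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```

Each time level reads three neighbouring values per node from the level after it, so the sweep touches O(steps^2) values in total. That repeated streaming over a large array is the kind of memory traffic the abstract refers to. With the inputs above, the result should be close to the Black-Scholes value of roughly 10.45.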
Related resources
A Unified Approach for Design of Lp Polynomial Algorithms
By summarizing Khachiyan's algorithm and Karmarkar's algorithm for linear programming (LP), a unified methodology for the design of polynomial-time algorithms for LP is presented in this paper. A key concept is the so-called extended binary search (EBS) algorithm introduced by the author. It is used as a unified model to analyze the complexities of the existing modern LP algorithms and possibly, help ...
Memory Hierarchy Issues in Multicore Architectures
Multicore architectures have introduced a new problem to parallel computing, namely, the management of hierarchical parallel caches. As with other architectures, a cache structure is designed to simulate a fast common memory. To address the challenge of management of these caches we a) introduce the Unified Multicore Model (UMM), a hierarchical arrangement of caches, b) present a general strate...
Evaluating the Portability of UPC to the Cell Broadband Engine
Unified Parallel C (UPC) is a parallel programming language for distributed as well as shared memory systems. The Cell Broadband Engine (Cell BE) is a state-of-the-art multicore processor. In this paper we evaluate the opportunities and pitfalls of implementing the Berkeley UPC runtime system API for the Cell BE and thus bringing UPC to Cell. We propose a mapping of the distributed shared memor...
Graph Algorithms for Multicores with Multilevel Caches
Historically, the primary model of computation employed in the design and analysis of algorithms has been the sequential RAM model. However, recent developments in computer architecture have reduced the efficacy of the sequential RAM model for algorithmic development. In response, theoretical computer scientists have developed models of computation which better reflect these modern architecture...
Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures
The current trend toward multicore architectures underscores the need for parallelism. While new languages and alternatives for supporting these systems more efficiently are proposed, MPI faces this new challenge. Therefore, up-to-date performance evaluations of current options for programming multicore systems are needed. This paper evaluates MPI performance against Unified Parallel C (UPC) and Ope...
Journal:
- Scientific Programming
Volume 17
Publication date: 2009